1. Technical Field
This invention relates to the field of speech recognition, and more particularly, to enrolling users in a speech recognition system.
2. Description of the Related Art
Speaker-dependent speech recognition systems (SRS) utilize a process called enrollment for processing user speech with improved accuracy. During enrollment, the user is asked to provide a speech sample to the SRS. Typically, the speech sample is derived from the user speaking a known body of text, called an enrollment script, into a microphone. The user speech sample can be processed to develop acoustic models tailored to the user. The acoustic models then can be used by the SRS to more accurately process subsequent speech from the user.
Users can be enrolled in an SRS using one of several different enrollment techniques. One enrollment technique involves the SRS presenting the user with text from an enrollment script. The user then reads the text aloud into a microphone. The SRS can record the speech for processing against the known enrollment script. Asking the user to read an enrollment script aloud, however, has disadvantages. One such disadvantage is that reading can be difficult for users who have learning disabilities or for users who may not be proficient in reading. Additionally, reading an enrollment script requires a visual interface.
Another enrollment technique is to play portions of the enrollment script phrase by phrase through an audio interface. After each phrase is played, the user repeats the phrase back to the SRS. Thus, the user speech sample can be collected phrase by phrase until the user has dictated the entire enrollment script. The enrollment technique of iteratively playing a phrase and receiving user speech can be useful for users who are unable to read effectively or for users who must interact through an audio-only interface. Still, because each phrase must first be played and then repeated by the user, in many cases this enrollment technique increases the already significant enrollment time by a factor of two.
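The play-and-repeat technique described above can be sketched as a simple loop. The following Python fragment is an illustrative sketch only, not an implementation taken from any particular SRS: the names `Phrase`, `enroll_by_repetition`, `play_prompt`, and `record_user` are hypothetical, and the audio playback and recording steps are assumed to be supplied by the caller. The sketch tracks elapsed time to show why the total roughly doubles when each phrase is both played and repeated.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Phrase:
    text: str               # known text from the enrollment script
    prompt_seconds: float   # assumed time needed to play the phrase aloud


def enroll_by_repetition(
    script: List[Phrase],
    play_prompt: Callable[[Phrase], None],
    record_user: Callable[[Phrase], Tuple[bytes, float]],
) -> Tuple[List[Tuple[str, bytes]], float]:
    """Play each phrase, then record the user repeating it.

    Returns the (text, audio) pairs collected for later acoustic model
    training, together with the total elapsed time in seconds.
    """
    samples: List[Tuple[str, bytes]] = []
    elapsed = 0.0
    for phrase in script:
        play_prompt(phrase)                          # system plays the phrase
        elapsed += phrase.prompt_seconds
        audio, spoken_seconds = record_user(phrase)  # user repeats the phrase
        elapsed += spoken_seconds
        samples.append((phrase.text, audio))         # pair audio with known text
    return samples, elapsed


# Stub playback and recording, assuming the user takes as long to repeat
# a phrase as the system took to play it.
demo_script = [
    Phrase("the quick brown fox", 2.0),
    Phrase("jumps over the lazy dog", 2.5),
]
demo_samples, demo_total = enroll_by_repetition(
    demo_script,
    play_prompt=lambda phrase: None,
    record_user=lambda phrase: (b"", phrase.prompt_seconds),
)
speaking_only = sum(phrase.prompt_seconds for phrase in demo_script)
print(f"speaking only: {speaking_only} s, play and repeat: {demo_total} s")
```

Under the stated assumption that playback and repetition take equal time, the play-and-repeat total is twice the time the user spends speaking, which matches the factor of two noted above.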